Overview
Dataset statistics
| Number of variables | 13 |
|---|---|
| Number of observations | 397884 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 4835 |
| Duplicate rows (%) | 1.2% |
| Total size in memory | 36.4 MiB |
| Average record size in memory | 96.0 B |
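The derived figures in this table can be cross-checked from the raw counts. A minimal sketch, assuming the memory total is simply the observation count times the average record size (the variable names are illustrative and not part of the report):

```python
# Values copied from the Dataset statistics table.
n_rows = 397_884
n_duplicates = 4_835
avg_record_bytes = 96.0

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicates / n_rows

# Total memory in MiB (1 MiB = 1024 * 1024 bytes), assuming
# total = rows * average record size.
total_mib = n_rows * avg_record_bytes / (1024 * 1024)

print(f"{duplicate_pct:.1f}%")  # 1.2%
print(f"{total_mib:.1f} MiB")   # 36.4 MiB
```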
Variable types
| Numeric | 8 |
|---|---|
| Unsupported | 1 |
| Text | 1 |
| DateTime | 1 |
| Categorical | 2 |
Alerts
| Alert | Type |
|---|---|
| Dataset has 4835 (1.2%) duplicate rows | Duplicates |
| Country is highly overall correlated with InvoiceNo | High correlation |
| InvoiceNo is highly overall correlated with Country and 2 other fields | High correlation |
| Month is highly overall correlated with InvoiceNo | High correlation |
| Quantity is highly overall correlated with TotalAmount | High correlation |
| TotalAmount is highly overall correlated with Quantity | High correlation |
| Year is highly overall correlated with InvoiceNo | High correlation |
| Country is highly imbalanced (82.7%) | Imbalance |
| Year is highly imbalanced (65.0%) | Imbalance |
| Quantity is highly skewed (γ1 = 409.8929717) | Skewed |
| UnitPrice is highly skewed (γ1 = 204.0327268) | Skewed |
| TotalAmount is highly skewed (γ1 = 451.4431818) | Skewed |
| StockCode is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
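The skewness alerts report γ1, the standardized third moment. The sketch below shows the plain population form on hypothetical toy samples; the report may well use a bias-corrected estimator, so this illustrates the idea rather than the exact computation behind the values above.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical toy samples, not taken from the dataset.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 500]  # one extreme value, like Quantity's maximum

print(skewness(symmetric))         # 0.0
print(skewness(right_tailed) > 1)  # True
```

A single extreme value is enough to push the statistic well above zero, which is consistent with the very large γ1 values reported for Quantity, UnitPrice and TotalAmount.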
Reproduction
| Analysis started | 2024-12-15 10:16:14.617700 |
|---|---|
| Analysis finished | 2024-12-15 10:16:29.456263 |
| Duration | 14.84 seconds |
| Software version | ydata-profiling v4.12.1 |
| Download configuration | config.json |
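The reported duration can be reproduced from the two timestamps. A small sketch using only the standard library, with the values copied from the table:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-12-15 10:16:14.617700", FMT)
finished = datetime.strptime("2024-12-15 10:16:29.456263", FMT)

# Elapsed wall-clock time between the two timestamps, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 14.84 seconds
```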
Variables
InvoiceNo
Real number (ℝ)
High correlation 
| Distinct | 18532 |
|---|---|
| Distinct (%) | 4.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 560616.93 |
| Minimum | 536365 |
|---|---|
| Maximum | 581587 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.1 MiB |
Quantile statistics
| Minimum | 536365 |
|---|---|
| 5-th percentile | 538863 |
| Q1 | 549234 |
| median | 561893 |
| Q3 | 572090 |
| 95-th percentile | 579493 |
| Maximum | 581587 |
| Range | 45222 |
| Interquartile range (IQR) | 22856 |
Descriptive statistics
| Standard deviation | 13106.118 |
|---|---|
| Coefficient of variation (CV) | 0.023378027 |
| Kurtosis | -1.200748 |
| Mean | 560616.93 |
| Median Absolute Deviation (MAD) | 11266 |
| Skewness | -0.17852408 |
| Sum | 2.2306051 × 10¹¹ |
| Variance | 1.7177032 × 10⁸ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 576339 | 542 | 0.1% |
| 579196 | 533 | 0.1% |
| 580727 | 529 | 0.1% |
| 578270 | 442 | 0.1% |
| 573576 | 435 | 0.1% |
| 567656 | 421 | 0.1% |
| 567183 | 399 | 0.1% |
| 575607 | 377 | 0.1% |
| 571441 | 364 | 0.1% |
| 570488 | 353 | 0.1% |
| Other values (18522) | 393489 | 98.9% |
| Value | Count | Frequency (%) |
| 536365 | 7 | < 0.1% |
| 536366 | 2 | < 0.1% |
| 536367 | 12 | < 0.1% |
| 536368 | 4 | < 0.1% |
| 536369 | 1 | < 0.1% |
| 536370 | 20 | < 0.1% |
| 536371 | 1 | < 0.1% |
| 536372 | 2 | < 0.1% |
| 536373 | 16 | < 0.1% |
| 536374 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 581587 | 15 | < 0.1% |
| 581586 | 4 | < 0.1% |
| 581585 | 21 | < 0.1% |
| 581584 | 2 | < 0.1% |
| 581583 | 2 | < 0.1% |
| 581582 | 2 | < 0.1% |
| 581581 | 3 | < 0.1% |
| 581580 | 24 | < 0.1% |
| 581579 | 30 | < 0.1% |
| 581578 | 38 | < 0.1% |
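The "< 0.1%" entries in the frequency tables appear to mark positive shares that would otherwise round to 0.0%. The helper below is a hypothetical sketch of that display rule, inferred from the tables rather than taken from the tool's source:

```python
def format_share(count, total):
    """Format count/total as a percentage with one decimal place.

    Assumed rule: a positive share that would round to 0.0% is shown as '< 0.1%'.
    """
    share = 100 * count / total
    if 0 < share < 0.05:
        return "< 0.1%"
    return f"{share:.1f}%"

N_ROWS = 397_884  # number of observations in the report

print(format_share(542, N_ROWS))  # 0.1%
print(format_share(353, N_ROWS))  # 0.1%
print(format_share(7, N_ROWS))    # < 0.1%
```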
StockCode
Unsupported
Rejected  Unsupported 
| Missing | 0 |
|---|---|
| Missing (%) | 0.0% |
| Memory size | 6.1 MiB |
Description
Text
| Distinct | 3877 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.1 MiB |
Length
| Max length | 35 |
|---|---|
| Median length | 28 |
| Mean length | 26.677454 |
| Min length | 6 |
Unique
| Unique | 213 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | WHITE HANGING HEART T-LIGHT HOLDER |
|---|---|
| 2nd row | WHITE METAL LANTERN |
| 3rd row | CREAM CUPID HEARTS COAT HANGER |
| 4th row | KNITTED UNION FLAG HOT WATER BOTTLE |
| 5th row | RED WOOLLY HOTTIE WHITE HEART. |
| Value | Count | Frequency (%) |
| of | 40804 | 2.3% |
| set | 40719 | 2.3% |
| bag | 37774 | 2.2% |
| red | 31813 | 1.8% |
| heart | 29307 | 1.7% |
| retrospot | 26336 | 1.5% |
| vintage | 25579 | 1.5% |
| design | 23519 | 1.3% |
| pink | 20142 | 1.2% |
| christmas | 19057 | 1.1% |
| Other values (2179) | 1453890 | 83.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1453261 | 13.7% |
| E | 951356 | 9.0% |
| A | 806740 | 7.6% |
| T | 705588 | 6.6% |
| R | 678203 | 6.4% |
| O | 634322 | 6.0% |
| I | 580299 | 5.5% |
| S | 571883 | 5.4% |
| N | 528406 | 5.0% |
| L | 519172 | 4.9% |
| Other values (58) | 3185302 | 30.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 10614532 | 100.0% |
Quantity
Real number (ℝ)
High correlation  Skewed 
| Distinct | 301 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12.988238 |
| Minimum | 1 |
|---|---|
| Maximum | 80995 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.1 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 6 |
| Q3 | 12 |
| 95-th percentile | 36 |
| Maximum | 80995 |
| Range | 80994 |
| Interquartile range (IQR) | 10 |
Descriptive statistics
| Standard deviation | 179.33177 |
|---|---|
| Coefficient of variation (CV) | 13.807245 |
| Kurtosis | 178186.24 |
| Mean | 12.988238 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | 409.89297 |
| Sum | 5167812 |
| Variance | 32159.886 |
| Monotonicity | Not monotonic |
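Several of the Quantity statistics are functions of one another, which gives a quick consistency check. A sketch with values copied from the tables above; the tolerances allow for the rounding of the printed inputs:

```python
# Values copied from the Quantity tables.
mean = 12.988238
std = 179.33177
q1, q3 = 2, 12

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
iqr = q3 - q1          # interquartile range

print(round(cv, 4))        # 13.8072
print(round(variance, 2))  # 32159.88
print(iqr)                 # 10
```

A coefficient of variation near 13.8 alongside an IQR of only 10 indicates that a few very large orders dominate the spread.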
| Value | Count | Frequency (%) |
| 1 | 73301 | 18.4% |
| 12 | 60031 | 15.1% |
| 2 | 57999 | 14.6% |
| 6 | 37688 | 9.5% |
| 4 | 32180 | 8.1% |
| 3 | 26948 | 6.8% |
| 24 | 23748 | 6.0% |
| 10 | 21212 | 5.3% |
| 8 | 11644 | 2.9% |
| 5 | 8148 | 2.0% |
| Other values (291) | 44985 | 11.3% |
| Value | Count | Frequency (%) |
| 1 | 73301 | 18.4% |
| 2 | 57999 | 14.6% |
| 3 | 26948 | 6.8% |
| 4 | 32180 | 8.1% |
| 5 | 8148 | 2.0% |
| 6 | 37688 | 9.5% |
| 7 | 1299 | 0.3% |
| 8 | 11644 | 2.9% |
| 9 | 1170 | 0.3% |
| 10 | 21212 | 5.3% |
| Value | Count | Frequency (%) |
| 80995 | 1 | < 0.1% |
| 74215 | 1 | < 0.1% |
| 4800 | 1 | < 0.1% |
| 4300 | 1 | < 0.1% |
| 3906 | 1 | < 0.1% |
| 3186 | 1 | < 0.1% |
| 3114 | 2 | < 0.1% |
| 3000 | 1 | < 0.1% |
| 2880 | 2 | < 0.1% |
| 2700 | 1 | < 0.1% |
InvoiceDate
Date
| Distinct | 17282 |
|---|---|
| Distinct (%) | 4.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.1 MiB |
| Minimum | 2010-12-01 08:26:00 |
|---|---|
| Maximum | 2011-12-09 12:50:00 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
UnitPrice
Real number (ℝ)
Skewed 
| Distinct | 440 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.1164878 |
| Minimum | 0.001 |
|---|---|
| Maximum | 8142.75 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.1 MiB |
Quantile statistics
| Minimum | 0.001 |
|---|---|
| 5-th percentile | 0.42 |
| Q1 | 1.25 |
| median | 1.95 |
| Q3 | 3.75 |
| 95-th percentile | 8.5 |
| Maximum | 8142.75 |
| Range | 8142.749 |
| Interquartile range (IQR) | 2.5 |
Descriptive statistics
| Standard deviation | 22.097877 |
|---|---|
| Coefficient of variation (CV) | 7.0906348 |
| Kurtosis | 58140.397 |
| Mean | 3.1164878 |
| Median Absolute Deviation (MAD) | 1.1 |
| Skewness | 204.03273 |
| Sum | 1240000.6 |
| Variance | 488.31615 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1.25 | 45841 | 11.5% |
| 1.65 | 36834 | 9.3% |
| 2.95 | 26562 | 6.7% |
| 0.85 | 25968 | 6.5% |
| 0.42 | 21812 | 5.5% |
| 4.95 | 18122 | 4.6% |
| 3.75 | 17676 | 4.4% |
| 2.1 | 17190 | 4.3% |
| 2.08 | 15745 | 4.0% |
| 1.95 | 12677 | 3.2% |
| Other values (430) | 159457 | 40.1% |
| Value | Count | Frequency (%) |
| 0.001 | 4 | < 0.1% |
| 0.04 | 66 | < 0.1% |
| 0.06 | 112 | < 0.1% |
| 0.07 | 7 | < 0.1% |
| 0.08 | 55 | < 0.1% |
| 0.09 | 2 | < 0.1% |
| 0.1 | 53 | < 0.1% |
| 0.12 | 635 | 0.2% |
| 0.14 | 87 | < 0.1% |
| 0.16 | 45 | < 0.1% |
| Value | Count | Frequency (%) |
| 8142.75 | 1 | < 0.1% |
| 4161.06 | 2 | < 0.1% |
| 3949.32 | 1 | < 0.1% |
| 3155.95 | 1 | < 0.1% |
| 2500 | 1 | < 0.1% |
| 2382.92 | 1 | < 0.1% |
| 2118.74 | 1 | < 0.1% |
| 2053.07 | 1 | < 0.1% |
| 2033.1 | 1 | < 0.1% |
| 1867.86 | 1 | < 0.1% |
CustomerID
Real number (ℝ)
| Distinct | 4338 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15294.423 |
| Minimum | 12346 |
|---|---|
| Maximum | 18287 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.1 MiB |
Quantile statistics
| Minimum | 12346 |
|---|---|
| 5-th percentile | 12627 |
| Q1 | 13969 |
| median | 15159 |
| Q3 | 16795 |
| 95-th percentile | 17912 |
| Maximum | 18287 |
| Range | 5941 |
| Interquartile range (IQR) | 2826 |
Descriptive statistics
| Standard deviation | 1713.1416 |
|---|---|
| Coefficient of variation (CV) | 0.11201086 |
| Kurtosis | -1.180822 |
| Mean | 15294.423 |
| Median Absolute Deviation (MAD) | 1479 |
| Skewness | 0.025728933 |
| Sum | 6.0854064 × 10⁹ |
| Variance | 2934854 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 17841 | 7847 | 2.0% |
| 14911 | 5675 | 1.4% |
| 14096 | 5111 | 1.3% |
| 12748 | 4595 | 1.2% |
| 14606 | 2700 | 0.7% |
| 15311 | 2379 | 0.6% |
| 14646 | 2076 | 0.5% |
| 13089 | 1818 | 0.5% |
| 13263 | 1677 | 0.4% |
| 14298 | 1637 | 0.4% |
| Other values (4328) | 362369 | 91.1% |
| Value | Count | Frequency (%) |
| 12346 | 1 | < 0.1% |
| 12347 | 182 | < 0.1% |
| 12348 | 31 | < 0.1% |
| 12349 | 73 | < 0.1% |
| 12350 | 17 | < 0.1% |
| 12352 | 85 | < 0.1% |
| 12353 | 4 | < 0.1% |
| 12354 | 58 | < 0.1% |
| 12355 | 13 | < 0.1% |
| 12356 | 59 | < 0.1% |
| Value | Count | Frequency (%) |
| 18287 | 70 | < 0.1% |
| 18283 | 756 | 0.2% |
| 18282 | 12 | < 0.1% |
| 18281 | 7 | < 0.1% |
| 18280 | 10 | < 0.1% |
| 18278 | 9 | < 0.1% |
| 18277 | 8 | < 0.1% |
| 18276 | 14 | < 0.1% |
| 18274 | 11 | < 0.1% |
| 18273 | 3 | < 0.1% |
Country
Categorical
High correlation  Imbalance 
| Distinct | 37 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.1 MiB |
| United Kingdom | 354321 |
|---|---|
| Germany | 9040 |
| France | 8341 |
| EIRE | 7236 |
| Spain | 2484 |
| Other values (32) | 16462 |
Length
| Max length | 20 |
|---|---|
| Median length | 14 |
| Mean length | 13.204638 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | United Kingdom |
|---|---|
| 2nd row | United Kingdom |
| 3rd row | United Kingdom |
| 4th row | United Kingdom |
| 5th row | United Kingdom |
Common Values
| Value | Count | Frequency (%) |
| United Kingdom | 354321 | 89.1% |
| Germany | 9040 | 2.3% |
| France | 8341 | 2.1% |
| EIRE | 7236 | 1.8% |
| Spain | 2484 | 0.6% |
| Netherlands | 2359 | 0.6% |
| Belgium | 2031 | 0.5% |
| Switzerland | 1841 | 0.5% |
| Portugal | 1462 | 0.4% |
| Australia | 1182 | 0.3% |
| Other values (27) | 7587 | 1.9% |
Words
| Value | Count | Frequency (%) |
| united | 354389 | 47.1% |
| kingdom | 354321 | 47.0% |
| germany | 9040 | 1.2% |
| france | 8341 | 1.1% |
| eire | 7236 | 1.0% |
| spain | 2484 | 0.3% |
| netherlands | 2359 | 0.3% |
| belgium | 2031 | 0.3% |
| switzerland | 1841 | 0.2% |
| portugal | 1462 | 0.2% |
| Other values (33) | 9679 | 1.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 738932 | 14.1% |
| i | 718331 | 13.7% |
| d | 715710 | 13.6% |
| e | 384188 | 7.3% |
| m | 365960 | 7.0% |
| t | 362664 | 6.9% |
| g | 358036 | 6.8% |
| o | 357571 | 6.8% |
| (space) | 355299 | 6.8% |
| U | 354812 | 6.8% |
| Other values (30) | 542411 | 10.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 5253914 | 100.0% |
Hour
Real number (ℝ)
| Distinct | 15 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12.728202 |
| Minimum | 6 |
|---|---|
| Maximum | 20 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.6 MiB |
Quantile statistics
| Minimum | 6 |
|---|---|
| 5-th percentile | 9 |
| Q1 | 11 |
| median | 13 |
| Q3 | 14 |
| 95-th percentile | 17 |
| Maximum | 20 |
| Range | 14 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.2735189 |
|---|---|
| Coefficient of variation (CV) | 0.17862058 |
| Kurtosis | -0.20969555 |
| Mean | 12.728202 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.18902901 |
| Sum | 5064348 |
| Variance | 5.1688881 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 12 | 72065 | 18.1% |
| 13 | 64026 | 16.1% |
| 14 | 54118 | 13.6% |
| 11 | 49084 | 12.3% |
| 15 | 45369 | 11.4% |
| 10 | 37997 | 9.5% |
| 16 | 24089 | 6.1% |
| 9 | 21944 | 5.5% |
| 17 | 13071 | 3.3% |
| 8 | 8690 | 2.2% |
| Other values (5) | 7431 | 1.9% |
| Value | Count | Frequency (%) |
| 6 | 1 | < 0.1% |
| 7 | 379 | 0.1% |
| 8 | 8690 | 2.2% |
| 9 | 21944 | 5.5% |
| 10 | 37997 | 9.5% |
| 11 | 49084 | 12.3% |
| 12 | 72065 | 18.1% |
| 13 | 64026 | 16.1% |
| 14 | 54118 | 13.6% |
| 15 | 45369 | 11.4% |
| Value | Count | Frequency (%) |
| 20 | 802 | 0.2% |
| 19 | 3321 | 0.8% |
| 18 | 2928 | 0.7% |
| 17 | 13071 | 3.3% |
| 16 | 24089 | 6.1% |
| 15 | 45369 | 11.4% |
| 14 | 54118 | 13.6% |
| 13 | 64026 | 16.1% |
| 12 | 72065 | 18.1% |
| 11 | 49084 | 12.3% |
Day
Real number (ℝ)
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.042186 |
| Minimum | 1 |
|---|---|
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 7 |
| median | 15 |
| Q3 | 22 |
| 95-th percentile | 29 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 8.6537465 |
|---|---|
| Coefficient of variation (CV) | 0.57529848 |
| Kurtosis | -1.1728432 |
| Mean | 15.042186 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.11448166 |
| Sum | 5985045 |
| Variance | 74.887328 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 6 | 18346 | 4.6% |
| 5 | 16409 | 4.1% |
| 8 | 15854 | 4.0% |
| 7 | 15601 | 3.9% |
| 17 | 14912 | 3.7% |
| 4 | 14880 | 3.7% |
| 20 | 14667 | 3.7% |
| 23 | 14290 | 3.6% |
| 13 | 14172 | 3.6% |
| 14 | 14165 | 3.6% |
| Other values (21) | 244588 | 61.5% |
| Value | Count | Frequency (%) |
| 1 | 13629 | 3.4% |
| 2 | 12101 | 3.0% |
| 3 | 10875 | 2.7% |
| 4 | 14880 | 3.7% |
| 5 | 16409 | 4.1% |
| 6 | 18346 | 4.6% |
| 7 | 15601 | 3.9% |
| 8 | 15854 | 4.0% |
| 9 | 12947 | 3.3% |
| 10 | 14072 | 3.5% |
| Value | Count | Frequency (%) |
| 31 | 6770 | 1.7% |
| 30 | 10034 | 2.5% |
| 29 | 8137 | 2.0% |
| 28 | 13509 | 3.4% |
| 27 | 12432 | 3.1% |
| 26 | 8710 | 2.2% |
| 25 | 12008 | 3.0% |
| 24 | 12086 | 3.0% |
| 23 | 14290 | 3.6% |
| 22 | 12403 | 3.1% |
Month
Real number (ℝ)
High correlation 
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.612475 |
| Minimum | 1 |
|---|---|
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 5 |
| median | 8 |
| Q3 | 11 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.4165196 |
|---|---|
| Coefficient of variation (CV) | 0.44880536 |
| Kurtosis | -1.0744883 |
| Mean | 7.612475 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.44480255 |
| Sum | 3028882 |
| Variance | 11.672606 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 11 | 64531 | 16.2% |
| 10 | 49554 | 12.5% |
| 12 | 43461 | 10.9% |
| 9 | 40028 | 10.1% |
| 5 | 28320 | 7.1% |
| 6 | 27185 | 6.8% |
| 3 | 27175 | 6.8% |
| 8 | 27007 | 6.8% |
| 7 | 26825 | 6.7% |
| 4 | 22642 | 5.7% |
| Other values (2) | 41156 | 10.3% |
| Value | Count | Frequency (%) |
| 1 | 21229 | 5.3% |
| 2 | 19927 | 5.0% |
| 3 | 27175 | 6.8% |
| 4 | 22642 | 5.7% |
| 5 | 28320 | 7.1% |
| 6 | 27185 | 6.8% |
| 7 | 26825 | 6.7% |
| 8 | 27007 | 6.8% |
| 9 | 40028 | 10.1% |
| 10 | 49554 | 12.5% |
| Value | Count | Frequency (%) |
| 12 | 43461 | 10.9% |
| 11 | 64531 | 16.2% |
| 10 | 49554 | 12.5% |
| 9 | 40028 | 10.1% |
| 8 | 27007 | 6.8% |
| 7 | 26825 | 6.7% |
| 6 | 27185 | 6.8% |
| 5 | 28320 | 7.1% |
| 4 | 22642 | 5.7% |
| 3 | 27175 | 6.8% |
Year
Categorical
High correlation  Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.1 MiB |
| 2011 | 371727 |
|---|---|
| 2010 | 26157 |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2010 |
|---|---|
| 2nd row | 2010 |
| 3rd row | 2010 |
| 4th row | 2010 |
| 5th row | 2010 |
Common Values
| Value | Count | Frequency (%) |
| 2011 | 371727 | 93.4% |
| 2010 | 26157 | 6.6% |
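The imbalance alert for Year (65.0%) is consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition; it reproduces the reported figure for Year but is not taken from the tool's source, and the function name is illustrative.

```python
import math

def imbalance_score(counts):
    """1 - H / log2(k): 0 for a uniform split, approaching 1 as one class dominates."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(counts))  # entropy of a uniform split over k classes
    return 1 - entropy / max_entropy

# Counts copied from the Year table: 2011 and 2010.
year_counts = [371727, 26157]
print(f"{100 * imbalance_score(year_counts):.1f}%")  # 65.0%
```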
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 769611 | 48.4% |
| 0 | 424041 | 26.6% |
| 2 | 397884 | 25.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1591536 | 100.0% |
TotalAmount
Real number (ℝ)
High correlation  Skewed 
| Distinct | 2939 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 22.397 |
| Minimum | 0.001 |
|---|---|
| Maximum | 168469.6 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.1 MiB |
Quantile statistics
| Minimum | 0.001 |
|---|---|
| 5-th percentile | 1.25 |
| Q1 | 4.68 |
| median | 11.8 |
| Q3 | 19.8 |
| 95-th percentile | 67.5 |
| Maximum | 168469.6 |
| Range | 168469.6 |
| Interquartile range (IQR) | 15.12 |
Descriptive statistics
| Standard deviation | 309.07104 |
|---|---|
| Coefficient of variation (CV) | 13.799663 |
| Kurtosis | 232155.12 |
| Mean | 22.397 |
| Median Absolute Deviation (MAD) | 7.55 |
| Skewness | 451.44318 |
| Sum | 8911407.9 |
| Variance | 95524.909 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 15 | 20082 | 5.0% |
| 17.7 | 9174 | 2.3% |
| 16.5 | 8490 | 2.1% |
| 10.2 | 8028 | 2.0% |
| 19.8 | 7625 | 1.9% |
| 1.25 | 7552 | 1.9% |
| 3.75 | 6847 | 1.7% |
| 1.65 | 5751 | 1.4% |
| 10.5 | 5550 | 1.4% |
| 20.8 | 5524 | 1.4% |
| Other values (2929) | 313261 | 78.7% |
| Value | Count | Frequency (%) |
| 0.001 | 4 | < 0.1% |
| 0.06 | 1 | < 0.1% |
| 0.08 | 1 | < 0.1% |
| 0.1 | 3 | < 0.1% |
| 0.12 | 24 | < 0.1% |
| 0.14 | 4 | < 0.1% |
| 0.16 | 1 | < 0.1% |
| 0.18 | 2 | < 0.1% |
| 0.19 | 95 | < 0.1% |
| 0.21 | 109 | < 0.1% |
| Value | Count | Frequency (%) |
| 168469.6 | 1 | < 0.1% |
| 77183.6 | 1 | < 0.1% |
| 38970 | 1 | < 0.1% |
| 8142.75 | 1 | < 0.1% |
| 7144.72 | 1 | < 0.1% |
| 6539.4 | 2 | < 0.1% |
| 4992 | 1 | < 0.1% |
| 4921.5 | 1 | < 0.1% |
| 4632 | 1 | < 0.1% |
| 4522.5 | 1 | < 0.1% |
Interactions
Correlations
| | Country | CustomerID | Day | Hour | InvoiceNo | Month | Quantity | TotalAmount | UnitPrice | Year |
|---|---|---|---|---|---|---|---|---|---|---|
| Country | 1.000 | 0.301 | 0.062 | 0.086 | 0.976 | 0.065 | 0.000 | 0.000 | 0.042 | 0.055 |
| CustomerID | 0.301 | 1.000 | -0.003 | 0.057 | 0.001 | 0.033 | -0.154 | -0.174 | -0.012 | 0.054 |
| Day | 0.062 | -0.003 | 1.000 | 0.013 | 0.086 | -0.148 | 0.004 | 0.001 | -0.004 | 0.188 |
| Hour | 0.086 | 0.057 | 0.013 | 1.000 | 0.052 | 0.060 | -0.151 | -0.168 | -0.008 | 0.037 |
| InvoiceNo | 0.976 | 0.001 | 0.086 | 0.052 | 1.000 | 0.641 | -0.034 | -0.067 | -0.047 | 0.976 |
| Month | 0.065 | 0.033 | -0.148 | 0.060 | 0.641 | 1.000 | -0.057 | -0.073 | -0.021 | 0.435 |
| Quantity | 0.000 | -0.154 | 0.004 | -0.151 | -0.034 | -0.057 | 1.000 | 0.657 | -0.408 | 0.000 |
| TotalAmount | 0.000 | -0.174 | 0.001 | -0.168 | -0.067 | -0.073 | 0.657 | 1.000 | 0.349 | 0.000 |
| UnitPrice | 0.042 | -0.012 | -0.004 | -0.008 | -0.047 | -0.021 | -0.408 | 0.349 | 1.000 | 0.000 |
| Year | 0.055 | 0.054 | 0.188 | 0.037 | 0.976 | 0.435 | 0.000 | 0.000 | 0.000 | 1.000 |
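The "High correlation" alerts can be matched against this matrix by scanning its upper triangle for large absolute values. The sketch below assumes a cut-off of 0.5, which is not stated in the report; with that assumption it flags the same pairs as the alerts.

```python
# Column order and values copied from the correlation matrix.
names = ["Country", "CustomerID", "Day", "Hour", "InvoiceNo",
         "Month", "Quantity", "TotalAmount", "UnitPrice", "Year"]
matrix = [
    [1.000, 0.301, 0.062, 0.086, 0.976, 0.065, 0.000, 0.000, 0.042, 0.055],
    [0.301, 1.000, -0.003, 0.057, 0.001, 0.033, -0.154, -0.174, -0.012, 0.054],
    [0.062, -0.003, 1.000, 0.013, 0.086, -0.148, 0.004, 0.001, -0.004, 0.188],
    [0.086, 0.057, 0.013, 1.000, 0.052, 0.060, -0.151, -0.168, -0.008, 0.037],
    [0.976, 0.001, 0.086, 0.052, 1.000, 0.641, -0.034, -0.067, -0.047, 0.976],
    [0.065, 0.033, -0.148, 0.060, 0.641, 1.000, -0.057, -0.073, -0.021, 0.435],
    [0.000, -0.154, 0.004, -0.151, -0.034, -0.057, 1.000, 0.657, -0.408, 0.000],
    [0.000, -0.174, 0.001, -0.168, -0.067, -0.073, 0.657, 1.000, 0.349, 0.000],
    [0.042, -0.012, -0.004, -0.008, -0.047, -0.021, -0.408, 0.349, 1.000, 0.000],
    [0.055, 0.054, 0.188, 0.037, 0.976, 0.435, 0.000, 0.000, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in the report

# Walk the upper triangle so each pair is reported once.
flagged = [
    (names[i], names[j], matrix[i][j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(matrix[i][j]) >= THRESHOLD
]

for a, b, value in flagged:
    print(f"{a} ~ {b}: {value:.3f}")
```

Under this assumption the flagged pairs are Country and InvoiceNo, InvoiceNo and Month, InvoiceNo and Year, and Quantity and TotalAmount.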
Missing values
Sample
| | InvoiceNo | StockCode | Description | Quantity | InvoiceDate | UnitPrice | CustomerID | Country | Hour | Day | Month | Year | TotalAmount |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 536365 | 85123A | WHITE HANGING HEART T-LIGHT HOLDER | 6 | 2010-12-01 08:26:00 | 2.55 | 17850.0 | United Kingdom | 8 | 1 | 12 | 2010 | 15.30 |
| 1 | 536365 | 71053 | WHITE METAL LANTERN | 6 | 2010-12-01 08:26:00 | 3.39 | 17850.0 | United Kingdom | 8 | 1 | 12 | 2010 | 20.34 |
| 2 | 536365 | 84406B | CREAM CUPID HEARTS COAT HANGER | 8 | 2010-12-01 08:26:00 | 2.75 | 17850.0 | United Kingdom | 8 | 1 | 12 | 2010 | 22.00 |
| 3 | 536365 | 84029G | KNITTED UNION FLAG HOT WATER BOTTLE | 6 | 2010-12-01 08:26:00 | 3.39 | 17850.0 | United Kingdom | 8 | 1 | 12 | 2010 | 20.34 |
| 4 | 536365 | 84029E | RED WOOLLY HOTTIE WHITE HEART. | 6 | 2010-12-01 08:26:00 | 3.39 | 17850.0 | United Kingdom | 8 | 1 | 12 | 2010 | 20.34 |
| 5 | 536365 | 22752 | SET 7 BABUSHKA NESTING BOXES | 2 | 2010-12-01 08:26:00 | 7.65 | 17850.0 | United Kingdom | 8 | 1 | 12 | 2010 | 15.30 |
| 6 | 536365 | 21730 | GLASS STAR FROSTED T-LIGHT HOLDER | 6 | 2010-12-01 08:26:00 | 4.25 | 17850.0 | United Kingdom | 8 | 1 | 12 | 2010 | 25.50 |
| 7 | 536366 | 22633 | HAND WARMER UNION JACK | 6 | 2010-12-01 08:28:00 | 1.85 | 17850.0 | United Kingdom | 8 | 1 | 12 | 2010 | 11.10 |
| 8 | 536366 | 22632 | HAND WARMER RED POLKA DOT | 6 | 2010-12-01 08:28:00 | 1.85 | 17850.0 | United Kingdom | 8 | 1 | 12 | 2010 | 11.10 |
| 9 | 536367 | 84879 | ASSORTED COLOUR BIRD ORNAMENT | 32 | 2010-12-01 08:34:00 | 1.69 | 13047.0 | United Kingdom | 8 | 1 | 12 | 2010 | 54.08 |
| | InvoiceNo | StockCode | Description | Quantity | InvoiceDate | UnitPrice | CustomerID | Country | Hour | Day | Month | Year | TotalAmount |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 541899 | 581587 | 22726 | ALARM CLOCK BAKELIKE GREEN | 4 | 2011-12-09 12:50:00 | 3.75 | 12680.0 | France | 12 | 9 | 12 | 2011 | 15.00 |
| 541900 | 581587 | 22730 | ALARM CLOCK BAKELIKE IVORY | 4 | 2011-12-09 12:50:00 | 3.75 | 12680.0 | France | 12 | 9 | 12 | 2011 | 15.00 |
| 541901 | 581587 | 22367 | CHILDRENS APRON SPACEBOY DESIGN | 8 | 2011-12-09 12:50:00 | 1.95 | 12680.0 | France | 12 | 9 | 12 | 2011 | 15.60 |
| 541902 | 581587 | 22629 | SPACEBOY LUNCH BOX | 12 | 2011-12-09 12:50:00 | 1.95 | 12680.0 | France | 12 | 9 | 12 | 2011 | 23.40 |
| 541903 | 581587 | 23256 | CHILDRENS CUTLERY SPACEBOY | 4 | 2011-12-09 12:50:00 | 4.15 | 12680.0 | France | 12 | 9 | 12 | 2011 | 16.60 |
| 541904 | 581587 | 22613 | PACK OF 20 SPACEBOY NAPKINS | 12 | 2011-12-09 12:50:00 | 0.85 | 12680.0 | France | 12 | 9 | 12 | 2011 | 10.20 |
| 541905 | 581587 | 22899 | CHILDREN'S APRON DOLLY GIRL | 6 | 2011-12-09 12:50:00 | 2.10 | 12680.0 | France | 12 | 9 | 12 | 2011 | 12.60 |
| 541906 | 581587 | 23254 | CHILDRENS CUTLERY DOLLY GIRL | 4 | 2011-12-09 12:50:00 | 4.15 | 12680.0 | France | 12 | 9 | 12 | 2011 | 16.60 |
| 541907 | 581587 | 23255 | CHILDRENS CUTLERY CIRCUS PARADE | 4 | 2011-12-09 12:50:00 | 4.15 | 12680.0 | France | 12 | 9 | 12 | 2011 | 16.60 |
| 541908 | 581587 | 22138 | BAKING SET 9 PIECE RETROSPOT | 3 | 2011-12-09 12:50:00 | 4.95 | 12680.0 | France | 12 | 9 | 12 | 2011 | 14.85 |
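In the sample rows, Hour, Day, Month and Year match the components of InvoiceDate, and TotalAmount matches Quantity times UnitPrice. The sketch below assumes the columns were derived that way; the report itself does not say how they were built.

```python
from datetime import datetime

# First sample row, reduced to the fields needed here.
row = {"Quantity": 6, "InvoiceDate": "2010-12-01 08:26:00", "UnitPrice": 2.55}

ts = datetime.strptime(row["InvoiceDate"], "%Y-%m-%d %H:%M:%S")

# Assumed derivation of the engineered columns.
derived = {
    "Hour": ts.hour,
    "Day": ts.day,
    "Month": ts.month,
    "Year": ts.year,
    "TotalAmount": round(row["Quantity"] * row["UnitPrice"], 2),
}
print(derived)
# {'Hour': 8, 'Day': 1, 'Month': 12, 'Year': 2010, 'TotalAmount': 15.3}
```

If this derivation holds, it would also explain why InvoiceNo, Month and Year are flagged as correlated, since invoice numbers appear to increase over time.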
Duplicate rows
Most frequently occurring
| | InvoiceNo | Description | Quantity | InvoiceDate | UnitPrice | CustomerID | Country | Hour | Day | Month | Year | TotalAmount | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1612 | 555524 | PINK REGENCY TEACUP AND SAUCER | 1 | 2011-06-05 11:37:00 | 2.95 | 16923.0 | United Kingdom | 11 | 5 | 6 | 2011 | 2.95 | 20 |
| 1611 | 555524 | GREEN REGENCY TEACUP AND SAUCER | 1 | 2011-06-05 11:37:00 | 2.95 | 16923.0 | United Kingdom | 11 | 5 | 6 | 2011 | 2.95 | 12 |
| 3194 | 572861 | PURPLE DRAWERKNOB ACRYLIC EDWARDIAN | 12 | 2011-10-26 12:46:00 | 1.25 | 14102.0 | United Kingdom | 12 | 26 | 10 | 2011 | 15.00 | 8 |
| 345 | 538514 | BATH BUILDING BLOCK WORD | 1 | 2010-12-12 14:27:00 | 5.95 | 15044.0 | United Kingdom | 14 | 12 | 12 | 2010 | 5.95 | 6 |
| 478 | 540524 | BATH BUILDING BLOCK WORD | 1 | 2011-01-09 12:53:00 | 5.95 | 16735.0 | United Kingdom | 12 | 9 | 1 | 2011 | 5.95 | 6 |
| 535 | 541266 | HOME BUILDING BLOCK WORD | 1 | 2011-01-16 16:25:00 | 5.95 | 15673.0 | United Kingdom | 16 | 16 | 1 | 2011 | 5.95 | 6 |
| 536 | 541266 | LOVE BUILDING BLOCK WORD | 1 | 2011-01-16 16:25:00 | 5.95 | 15673.0 | United Kingdom | 16 | 16 | 1 | 2011 | 5.95 | 6 |
| 1089 | 547651 | METAL SIGN,CUPCAKE SINGLE HOOK | 1 | 2011-03-24 12:11:00 | 1.25 | 16904.0 | United Kingdom | 12 | 24 | 3 | 2011 | 1.25 | 6 |
| 3140 | 572344 | Manual | 48 | 2011-10-24 10:43:00 | 1.50 | 14607.0 | United Kingdom | 10 | 24 | 10 | 2011 | 72.00 | 6 |
| 4233 | 578289 | BELLE JARDINIERE CUSHION COVER | 1 | 2011-11-23 14:07:00 | 3.75 | 17841.0 | United Kingdom | 14 | 23 | 11 | 2011 | 3.75 | 6 |
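Repeated rows like these can be counted by treating each row as a hashable tuple. The sketch below uses hypothetical toy rows with only four of the columns, and it may not match how the report counts duplicates.

```python
from collections import Counter

# Hypothetical rows shaped like (InvoiceNo, Description, Quantity, UnitPrice).
rows = [
    (555524, "PINK REGENCY TEACUP AND SAUCER", 1, 2.95),
    (555524, "PINK REGENCY TEACUP AND SAUCER", 1, 2.95),
    (555524, "GREEN REGENCY TEACUP AND SAUCER", 1, 2.95),
    (555524, "PINK REGENCY TEACUP AND SAUCER", 1, 2.95),
]

counts = Counter(rows)

# Rows that occur more than once, with their occurrence counts.
repeated = {row: n for row, n in counts.items() if n > 1}

# Rows that would be dropped if only the first copy of each were kept.
n_extra = sum(n - 1 for n in repeated.values())

print(len(repeated), n_extra)  # 1 2
```

Whether repeated lines within one invoice are genuine repeat purchases or data-entry duplicates is a judgment call that the report cannot settle on its own.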